Abstract

We describe a simple method to compute the Cramer-Rao limit of a high energy
experiment, i.e., the smallest error with which a parameter can in principle be
determined in a reaction. This precision remains a theoretical paradigm since it
assumes perfect experimental conditions. Nevertheless, it is shown by means of
an example that for simple processes this asymptotic resolving power can be
approached very closely. In all situations, the procedure is at least a useful
test of what can and what cannot be measured by studying a particular reaction.



1 Introduction 



It is customary in high energy physics to anticipate experimental results and
to determine many years in advance of an experiment how precisely it can
measure a parameter. For instance, in the past few years a true industry has
been developed to estimate the discovery potential of LEP II. In particular,
the reaction $e^+e^- \to W^+W^-$ is a prime candidate for testing anomalous gauge
couplings, since it involves the as yet unprobed $WW\gamma$ and $WWZ$ vertices.
Typically, one assumes a particular form for these couplings (generally, their
standard model prediction) and then proceeds to determine the expected
experimental error bounds around this central value.

In general, this procedure depends on four ingredients: 

- A theory (e.g., the standard model, its supersymmetric extension, etc.)
which depends on one or more parameters (couplings, masses, etc.). It is
the precision with which these parameters can be determined that we wish to
compute.

- A reaction characterized by its initial and final state (e.g., $e^+e^- \to \ldots$
with or without polarization). This reaction should of course be as sensitive
as possible to the values taken by the parameters.

- An observable of this reaction (e.g., the total cross section, asymmetries,
etc.). It should obviously also depend as much as possible on the parameters.

- A consistent, unbiased and efficient statistical estimator. It is generally
chosen to be a least squares or maximum likelihood estimator, which are both
equivalent and optimal in the asymptotic limit.

The issue we wish to address here is how to optimize the last two of these four 
items. For this we shall assume a perfect experiment with no other errors than 
statistical ones. We shall introduce a theoretical observable and a statistical 
estimator, which yield the smallest possible error on the parameters that can 
be obtained with a given amount of data. This theoretical limit is nothing
else but the Cramer-Rao minimum variance bound [1]. It clearly defines a
boundary between what in principle can be achieved and what certainly cannot 
be achieved, by studying a particular reaction. In the experimental practice, 
of course, it remains the task of the physicist to make use of an observable 
(or a set of observables) which yields a sensitivity that comes close to this 
asymptotic resolving power. 

In the next Section we define the $\chi_\infty^2$ estimator, which computes the
Cramer-Rao limit of the error in the determination of a parameter. In Section 3, we use
this criterion to derive limits for an electric dipole moment of the electron in a
high energy Møller scattering experiment. Because this reaction is particularly
simple it allows the derivation of analytical formulae which nicely exhibit some
general features of the procedure. In Section 4, we consider a similar analysis
in Compton scattering. This example will display how realistic a goal the
result of the estimator can be when the phase space is larger. Finally, we
recapitulate in the Conclusion the aim and the domain of applicability of this
estimator.



2 The Cramer-Rao Limit 

Let us consider a generic high-energy scattering experiment and a theory which 
by assumption is the correct one. For simplicity we concentrate here on the 
determination of a single parameter p of this theory. It is straightforward to 
extend all results to follow to the case where several parameters are involved. 
The true value of the parameter is $\bar p$.

We wish to determine the range of values of $p$ which would be indistinguishable
from $\bar p$ when a particular measurement is performed. For example, one could
compare the total predicted rates $n(p)$ and $n(\bar p)$. The values of $p$ for which

$$n(\bar p) - \chi_1 \Delta n(\bar p) < n(p) < n(\bar p) + \chi_1 \Delta n(\bar p) \qquad (1)$$

cannot be distinguished from $\bar p$ to better than $\chi_1$ standard deviations. The
average numbers of events $n$ are computed by integrating the differential cross
sections over the final state phase space $\Omega$ which can be explored by the
experiment:

$$n(p) = \mathcal{L} \int_\Omega d\Omega\, \frac{d\sigma(p)}{d\Omega} , \qquad (2)$$

where $\mathcal{L}$ is the time integrated luminosity. If systematic errors can be neglected
the numbers of events are distributed according to Poisson statistics, and the
standard deviation in Eq. (1) is given by

$$\Delta n = \sqrt{n} . \qquad (3)$$

In order to allow an easy generalization, we can rewrite Eqs (1,3) as a least
squares estimator

$$\chi_1^2 = \frac{\left( n(\bar p) - n(p) \right)^2}{n(\bar p)} . \qquad (4)$$
The probability that a measurement of $p$ deviates from $\bar p$ is quantified by $\chi_1^2$:
the computed interval of $p$ for which $\chi_1^2$ is less than a certain number (say
2.71) will contain a measured value of $p$ with the corresponding confidence
level (here 90%). The size of this interval is the precision with which the
parameter can be determined by measuring the total cross section.
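As a rough numerical illustration of this one-bin estimator, the sketch below scans a grid of parameter values and keeps those for which the squared rate difference, divided by the expected rate, stays below 2.71. The rate model `rate(p)`, which grows quadratically with the parameter, and all numbers in it are hypothetical choices made only for this example; they do not describe any reaction discussed here.

```python
# Toy illustration of the one-bin least squares estimator chi_1^2.
# The rate model and all numbers are assumptions made for this sketch only.

def rate(p, lumi=1000.0, sigma0=1.0):
    """Hypothetical expected number of events, n(p) = lumi * sigma0 * (1 + p^2)."""
    return lumi * sigma0 * (1.0 + p * p)

def chi2_one_bin(p, p_true):
    """(n(p_true) - n(p))^2 / n(p_true), i.e. a Poisson error sqrt(n) on the rate."""
    n_true = rate(p_true)
    return (n_true - rate(p)) ** 2 / n_true

def indistinguishable_range(p_true, threshold=2.71, lo=-2.0, hi=2.0, steps=4001):
    """Smallest and largest grid values of p with chi_1^2 below the threshold."""
    grid = [lo + (hi - lo) * k / (steps - 1) for k in range(steps)]
    inside = [p for p in grid if chi2_one_bin(p, p_true) < threshold]
    return min(inside), max(inside)

if __name__ == "__main__":
    low, high = indistinguishable_range(p_true=0.0)
    print(f"90% CL range around p_true = 0: [{low:.3f}, {high:.3f}]")
```

For a true value different from zero the allowed region of this toy model splits into two pieces of opposite sign; the helper then only returns the outer edges.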

The extent of this error band around $\bar p$ depends of course on the value of $\bar p$. If
experimental data are available, $\bar p$ is taken to be the best fit of $p$ to these data. In
the absence of actual data$^1$, though, the value of $\bar p$ is the result of an educated
guess or a theoretical bias, typically the standard model expectation.

Up to now only a very small portion of the available information has been 
used. Indeed, it might well be that two very different values of p yield the 
same number of events. Still, these events might have significantly different 
topologies. Upon integrating over the whole phase space in Eq. (2), these
differences are completely washed out. Striking examples of this phenomenon
have been discussed in Refs [?].

Clearly, it would be advantageous to include at least some of the information
contained in the event shape. This is usually done by considering asymmetries
or by dividing the phase space into a certain number $N$ of intervals
$\Delta\Omega_i$ ($i = 1 \ldots N$) of one or several kinematical variables. The previous
least squares estimator can then be applied separately to each bin in these
kinematical variables:

$$\chi_N^2 = \sum_{i=1}^{N} \frac{\left( n_i(\bar p) - n_i(p) \right)^2}{n_i(\bar p)} , \qquad (5)$$

where the index $i$ denotes a particular phase space bin and $N$ is the total
number of bins. This is a standard procedure which can substantially improve
the resolving power of an experiment [?]. Indeed, because of the triangle
inequality $\chi_N^2$ can only grow with the number of bins $N$ and one always has

$$\chi_N^2 \ge \chi_1^2 .$$

Of course, strictly speaking the quantitative probabilistic interpretation of this
analysis is only valid as long as the number of bins is not excessive and each
bin contains a certain minimum number of events, typically five.

1 This is the situation we consider from now on.

Indeed, a $\chi^2$ distribution is defined to be the weighted sum of the squares of
independent Gaussian variables. However, if too many bins that are too small
are used, this definition is not obeyed for two reasons:

A The binning of the final state phase space takes place with a certain
instrumental error, which introduces some amount of bin-to-bin correlation. The
numbers of events in different bins are thus not completely independent.

B The number of events in each bin is in reality distributed according to a 
Poisson distribution, which assumes only asymptotically a gaussian shape. 

Obviously, if the number of bins is taken to be so large that the calculated 
number of events in some bins is less than one, the whole procedure stops 
making sense. 
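The gain from binning can be made concrete with a toy distribution whose total rate does not depend on the parameter at all, so that the one-bin estimator is blind to it. The sketch below assumes a hypothetical shape $d\sigma/dx \propto (1 + p\,x)/2$ on $-1 \le x \le 1$ and an arbitrary total of 1000 events; neither comes from the reactions discussed in this paper. The binnings are nested (each one refines the previous one), which is the situation in which the binned estimator can only grow.

```python
# Toy illustration of the binned estimator chi_N^2 for nested binnings.
# The angular distribution below is a hypothetical choice for this sketch.

LUMI_SIGMA0 = 1000.0  # assumed luminosity times normalisation, i.e. total event count

def events_in_bin(p, a, b):
    """Expected events in a <= x <= b for dsigma/dx = sigma0 * (1 + p*x) / 2 on [-1, 1]."""
    return LUMI_SIGMA0 * 0.5 * ((b - a) + 0.5 * p * (b * b - a * a))

def chi2_binned(p, p_true, n_bins):
    """sum_i (n_i(p_true) - n_i(p))^2 / n_i(p_true) with equal-width bins."""
    width = 2.0 / n_bins
    total = 0.0
    for i in range(n_bins):
        a = -1.0 + i * width
        b = a + width
        n_true = events_in_bin(p_true, a, b)
        total += (n_true - events_in_bin(p, a, b)) ** 2 / n_true
    return total

if __name__ == "__main__":
    for n_bins in (1, 2, 4, 8, 16, 64):
        print(n_bins, round(chi2_binned(0.1, 0.0, n_bins), 4))
```

In this toy model a single bin gives exactly zero, two bins already give 2.5, and further refinements approach 10/3 from below.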

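Notwithstanding this limitation, let us increase (at least on paper) the number
of bins to infinity! In this limit the number of events per bin

$$n_i = \mathcal{L}\, \frac{d\sigma}{d\Omega}\, \Delta\Omega_i \qquad (6)$$

is infinitesimally small and $\chi_N^2$ (5) becomes

$$\chi_\infty^2 = \mathcal{L} \int_\Omega d\Omega\, \frac{\left( \frac{d\sigma(\bar p)}{d\Omega} - \frac{d\sigma(p)}{d\Omega} \right)^2}{\frac{d\sigma(\bar p)}{d\Omega}} . \qquad (7)$$

Comparing this with $\chi_1^2$ (4), we see that in essence the square of an integral
became the integral of a square. Clearly

$$\chi_\infty^2 \ge \chi_N^2 \ge \chi_1^2 , \qquad (8)$$

so $\chi_\infty^2$ is the most sensitive estimator of $p$.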
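One way to make this ordering explicit is sketched below. It uses the Cauchy-Schwarz inequality, which is a choice made for this illustration and not necessarily the argument intended in the text. The shorthands $a$ and $b$ for the differential cross sections at the true and at the tested parameter value are introduced here only for brevity.

```latex
% Shorthand (introduced for this sketch only):
%   a(\Omega) = d\sigma(\bar p)/d\Omega , \quad b(\Omega) = d\sigma(p)/d\Omega .
\begin{align*}
\left( \int_\Omega d\Omega\, (a-b) \right)^2
  &= \left( \int_\Omega d\Omega\, \frac{a-b}{\sqrt{a}}\, \sqrt{a} \right)^2
  \le \int_\Omega d\Omega\, \frac{(a-b)^2}{a} \; \int_\Omega d\Omega\, a , \\
\chi_1^2 = \mathcal{L}\, \frac{\left( \int_\Omega d\Omega\, (a-b) \right)^2}{\int_\Omega d\Omega\, a}
  &\le \mathcal{L} \int_\Omega d\Omega\, \frac{(a-b)^2}{a} = \chi_\infty^2 .
\end{align*}
```

Applying the same step to each bin separately bounds every term of the binned sum by the corresponding piece of the integral, and merging two neighbouring bins can by the same argument never increase the sum.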

Because in some sense we assumed an infinite data sample when taking the
limit (7), this is the asymptotic resolution which could also be obtained by
the maximum likelihood method. Indeed, defining the probability density

$$\rho = \frac{1}{\sigma} \frac{d\sigma}{d\Omega} , \qquad (9)$$

when $p$ is in the neighbourhood of $\bar p$, $\chi_\infty^2$ (7) can be rewritten in the linear
regime$^2$ as

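$$\chi_\infty^2 = n\, (p - \bar p)^2 \int d\Omega\, \frac{1}{\rho} \left( \frac{\partial\rho}{\partial p} \right)^2 = -(p - \bar p)^2 \left\langle \frac{\partial^2 \ln L}{\partial p^2} \right\rangle , \qquad (10)$$

which is nothing but the maximum likelihood estimator [1], where

$$L = \prod_{i=1}^{n} \rho(\Omega_i) \qquad (11)$$

is the maximum likelihood function.

To see that this is indeed the Cramer-Rao minimum variance bound, we set
$\chi_\infty^2 = 1$ in Eq. (10). Discretizing again into phase space bins, we obtain for
the dispersion of $p$ around $\bar p$

$$D(p) = (p - \bar p)^2 = \left[ \sum_i \frac{1}{n_i} \left( \frac{\partial n_i}{\partial p} \right)^2 \right]^{-1} . \qquad (12)$$

By definition, $n_i$ is the average number of events in bin $i$. The observed number
of events $N_i$ in this bin is distributed according to Poisson statistics, i.e.,

$$P_i = \frac{n_i^{N_i}}{N_i!}\, e^{-n_i} \qquad (13)$$

is the probability to find $N_i$ events in bin $i$. Assuming there are no bin-to-bin
correlations, we have

$$\langle N_i \rangle = n_i \qquad (14)$$

$$\langle (N_i - n_i)(N_j - n_j) \rangle = \delta_{ij}\, n_i , \qquad (15)$$

and we can rewrite

$$D(p)^{-1} = \sum_{i,j} \frac{1}{n_i n_j} \frac{\partial n_i}{\partial p} \frac{\partial n_j}{\partial p}\, \langle (N_i - n_i)(N_j - n_j) \rangle . \qquad (16)$$

2 i.e. either if the dependence of $\rho(p)$ on the parameter $p$ is linear or if the
considered values of $p$ are close enough to $\bar p$ to warrant sufficient linearity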
This is nothing but the Cramer-Rao minimum variance bound$^3$

$$D(p)^{-1} = \left\langle \left( \frac{\partial \ln P}{\partial p} \right)^2 \right\rangle , \qquad P = \prod_i P_i . \qquad (17)$$



To derive this result, we only assumed the absence of bin-to-bin correlations
in Eq. (15). No assumption concerning the population of the bins is necessary.
Although we used the linear approximation in Eq. (12), Eq. (7) remains valid
even when the parameter dependence is far from linear, which is often the
case when the luminosity $\mathcal{L}$ is small. In contrast, the relations (10) assume
a linear parameter dependence because they are derived from the maximum
likelihood covariance matrix.
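The statement that no assumption on the bin population is needed can be checked numerically. For independent Poisson bins the derivative of the log-probability of bin $i$ with respect to the parameter is $(N_i/n_i - 1)\,\partial n_i/\partial p$, and its mean square should equal $(\partial n_i/\partial p)^2/n_i$ even when the bin expects far less than one event. The sketch below sums the Poisson probabilities explicitly; the bin contents and slopes are made-up numbers, and adding the bins relies on the assumed absence of bin-to-bin correlations.

```python
# Exact check, for sparsely populated bins, that the Poisson score variance
# equals sum_i (dn_i/dp)^2 / n_i. All bin contents below are made-up numbers.
import math

def poisson_pmf(k, mean):
    """Poisson probability of observing k events when `mean` are expected."""
    return math.exp(-mean) * mean ** k / math.factorial(k)

def score_variance_one_bin(mean, dmean_dp, k_max=80):
    """<(d ln P_i / dp)^2> with d ln P_i / dp = (N_i / n_i - 1) * dn_i/dp."""
    return sum(
        poisson_pmf(k, mean) * ((k / mean - 1.0) * dmean_dp) ** 2
        for k in range(k_max + 1)
    )

def information(bins):
    """Sum over independent bins; `bins` holds (n_i, dn_i/dp) pairs."""
    return sum(score_variance_one_bin(n, dn) for n, dn in bins)

if __name__ == "__main__":
    bins = [(0.3, 2.0), (0.05, -0.4), (4.0, 1.5)]  # hypothetical (n_i, dn_i/dp)
    exact = information(bins)
    closed_form = sum(dn * dn / n for n, dn in bins)
    print(exact, closed_form, 1.0 / math.sqrt(exact))
```

The last printed number is the corresponding minimum standard deviation on the parameter for this invented set of bins.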

In the presence of real data the maximum likelihood function (11) can easily
be evaluated with all experimental resolutions and efficiencies folded in.
The linear approximation is then no longer necessary since the confidence
intervals can be estimated without having recourse to the covariance matrix. In
contrast, the $\chi_\infty^2$ estimator can of course not be applied experimentally, since it
assumes (A) the absence of systematic errors and (B) sufficient statistics to
fill infinitesimal bins. These limitations, however, only emphasize the fact that
$\chi_\infty^2$ yields the theoretical Cramer-Rao limit of what can be measured by the
reaction. In other words, any data analysis of a particular reaction, however
clever, cannot yield a more precise determination of a given parameter than
the asymptotic accuracy yielded by the $\chi_\infty^2$ estimator.
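How a confidence interval can be read off the likelihood itself, without a covariance matrix, is sketched below on pseudo-data. The density $\rho(x; p) = (1 + p\,x)/2$ on $-1 \le x \le 1$, the sample size, the seed and the grid are all hypothetical choices for this example, and no detector resolution or efficiency is modelled. The interval is taken as the range of $p$ over which $-2 \ln L$ rises by less than 2.71 above its minimum.

```python
# Sketch of an unbinned maximum likelihood scan on pseudo-data.
# The density rho(x; p) = (1 + p*x) / 2 on [-1, 1] is a hypothetical toy model.
import math
import random

def sample(p, n_events, rng):
    """Draw x from rho(x; p) by inverting its cumulative distribution (0 < |p| <= 1)."""
    return [(-1.0 + math.sqrt((1.0 - p) ** 2 + 4.0 * p * rng.random())) / p
            for _ in range(n_events)]

def neg2_log_likelihood(p, data):
    """-2 ln L with L = prod_i rho(x_i; p)."""
    return -2.0 * sum(math.log(0.5 * (1.0 + p * x)) for x in data)

def likelihood_interval(data, threshold=2.71, lo=-0.9, hi=0.9, steps=901):
    """Best grid value of p and the range where -2 ln L rises by less than threshold."""
    grid = [lo + (hi - lo) * k / (steps - 1) for k in range(steps)]
    values = [neg2_log_likelihood(p, data) for p in grid]
    best = min(values)
    p_hat = grid[values.index(best)]
    inside = [p for p, v in zip(grid, values) if v - best < threshold]
    return p_hat, min(inside), max(inside)

if __name__ == "__main__":
    rng = random.Random(1)
    data = sample(0.3, 2000, rng)
    p_hat, low, high = likelihood_interval(data)
    print(f"p_hat = {p_hat:.3f}, 90% CL interval = [{low:.3f}, {high:.3f}]")
```

A grid scan is used here only for transparency; a numerical minimizer would serve equally well.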

If the systematic errors can be neglected with respect to the statistical error,
the Cramer-Rao bound predicted by the $\chi_\infty^2$ estimator (7) can be experimentally
reached with a maximum likelihood analysis. However, if the systematic
errors are large, the question arises: how close can one come in practice to the
theoretical precision given by the $\chi_\infty^2$ estimator? There is no general answer
to this question and a separate analysis has to be performed for each case.
This issue is addressed in the next Section by means of a simple example.



3 Electric Dipole Moment of the Electron in Møller Scattering



To illustrate how the $\chi_\infty^2$ estimator works in practice, let us analyze a
particularly simple example. If the electron is a composite particle, its non-elementary
nature might reveal itself at energies far below its binding energy by an electric
dipole moment $d$. This dipole now plays the role of the parameter $p$.

3 I am indebted to Sergey Alekhin for pointing out this derivation to me.

The electron-photon coupling is then described by the effective lagrangian



where $e$ and $d$ are the electromagnetic charge and electric dipole moment of
the electron, $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$ is the strength of the electromagnetic field $A_\mu$
and $\sigma^{\mu\nu} = \frac{i}{2} (\gamma^\mu\gamma^\nu - \gamma^\nu\gamma^\mu)$. The first term in the lagrangian (18) represents the
standard point-like electron-photon coupling, whereas the second term arises
from new interactions.

The static limit for such an electric dipole of the electron is very tightly
constrained by low energy experiments [?]. However, such a dipole term might
well assume large values for high momentum transfers [?], if it behaves as a
function of the photon virtuality $Q^2$ as

(19) 



where A is the scale of new physics. 

To probe this electric dipole moment of the electron, let us consider a polarized
Møller scattering experiment. It has the virtue of being particularly simple
and of allowing the description of some important features of the $\chi_\infty^2$ estimator
with handy analytic formulae. The $e^-e^-$ reaction takes place at lowest order in
perturbation theory via the $t$- and $u$-channel exchanges of a photon or a neutral
vector boson $Z^0$. In the absence of transverse polarization the final state phase space
is one-dimensional. Neglecting the mass of the electron and terms of $O(d^4)$,
the differential cross section for left-polarized electron beams becomes

$$\frac{d\sigma}{d\cos\theta} = \frac{e^4}{\pi s}\, \frac{1}{\sin^4\theta} \left( 1 + \frac{d^2 s}{2 e^2}\, \sin^2\theta \cos^2\theta \right) , \qquad (20)$$

where $\theta$ is the polar angle of the emerging electrons and $\sqrt{s}$ is the centre
of mass energy. To derive Eq. (20) we have ignored the $Z^0$ exchange. This
approximation doesn't introduce any qualitative change, but has the virtue of
keeping the analytic expressions simple. In our numerical calculations the $Z^0$
is of course taken into account.

Such a Møller scattering experiment will be possible at one of the linear
colliders of the next generation (CLIC, JLC, NLC, TESLA, ...). To be specific,
we concentrate here on the canonical design with a centre of mass energy
$\sqrt{s} = 500$ GeV and an integrated luminosity $\mathcal{L} = 10$ fb$^{-1}$. In practice, also,
the scattered electrons can only be observed at a certain angle away from the
beampipe. We therefore impose the angular cut

$$|\cos\theta| < 1 - \epsilon . \qquad (21)$$

Of course, the resolving power of this reaction depends on the true value $\bar d$
of the parameter. In Fig. 1 the 90% confidence level error band around $\bar d$
(derived from $\chi_1^2, \chi_\infty^2 = 2.71$) is plotted as a function of $\bar d$. Since only $|d|^2$ can
be observed in this experiment, the plot extends in the same way in the three
other quadrants. For (not too) large values of $\bar d$ the resolution scales like




C 

Indeed, the expression for $\chi_1^2$ approaches in the limit of a vanishing cut $\epsilon$

$$\chi_1^2 \propto (d^2 - \bar d^2)^2\, 2\epsilon . \qquad (22)$$

The reason why $\chi_1^2$ has no sensitivity when the whole kinematical range is
inspected ($\epsilon \to 0$) can be traced back to the fact that the dipole moment
induces no singularity along the beampipe, in contrast to the point-like coupling.
If small angle electrons are also considered, the standard model background
keeps increasing whereas the dipole signal does not improve. The collinear
divergence of the standard model cross section is eventually regulated by the
mass of the electron. Strictly speaking, thus, $\chi_1^2$ in (22) converges to a very
small but finite value. For our purposes, though, this effect is of no importance.

The angular cut (21) could be optimized (cf. Fig. 3) to maximize $\chi_1^2$.
However, a partition of the angular range into a reasonable number N of bins 
automatically takes care of this task. For the asymptotic limit we find the 
approximate result 

£ = T' (23) 



This is the theoretical limit which can only be approached from below by any 
experimental setup. 

To study the improvement of $\chi_N^2$ with increasing number of bins, let us assume
the validity of the standard model, i.e., $\bar d = 0$. This way we test the limit of
observability of the electron's electric dipole moment. The deviations from
$\bar d = 0$ which can be observed with a certain level of confidence (say again
90%) are the values of $d$ which yield a $\chi^2$ in excess of a given number (here
again 2.71). In Figs 2 and 3 the $d^4$ dependence and the angular cut $\epsilon$ behaviours
of $\chi_1^2$ and $\chi_\infty^2$ can be observed to agree with Eqs (22) and (23).
It also appears from Fig. 4, where $\chi_N^2$ is plotted as a function of the number
of bins $N$, that with only 30 bins one comes within 90% of the asymptotic
resolution. Because the event rates of this reaction are so large, however, the
error is in this case dominated by systematics. Assuming for this very clean
experiment a 0.1% systematic error, the expected results are displayed by the
dotted curve in Fig. 4.
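The way a small relative systematic error takes over once the event rates become large can be sketched with a toy model. The treatment below, which adds a relative systematic error in quadrature to the Poisson error of each bin and takes it as uncorrelated between bins, is an assumption of this sketch; the text does not state how the systematic error enters. The angular shape and the event totals are likewise hypothetical.

```python
# Toy sketch of how a relative systematic error can limit the gain from statistics.
# Adding it in quadrature, uncorrelated between bins, is an assumption of this
# sketch; the text does not specify how the systematic error is treated.

def events_in_bin(p, a, b, total):
    """Expected events in a <= x <= b for a toy shape (1 + p*x) / 2 on [-1, 1]."""
    return total * 0.5 * ((b - a) + 0.5 * p * (b * b - a * a))

def chi2_with_systematics(p, p_true, n_bins, total, rel_syst):
    """sum_i (n_i(p_true) - n_i(p))^2 / (n_i(p_true) + (rel_syst * n_i(p_true))^2)."""
    width = 2.0 / n_bins
    chi2 = 0.0
    for i in range(n_bins):
        a = -1.0 + i * width
        b = a + width
        n_true = events_in_bin(p_true, a, b, total)
        variance = n_true + (rel_syst * n_true) ** 2
        chi2 += (n_true - events_in_bin(p, a, b, total)) ** 2 / variance
    return chi2

if __name__ == "__main__":
    for total in (1e4, 1e6, 1e8):
        stat_only = chi2_with_systematics(0.01, 0.0, 30, total, 0.0)
        with_syst = chi2_with_systematics(0.01, 0.0, 30, total, 0.001)
        print(f"{total:.0e} events: stat only {stat_only:.3f}, with 0.1% syst {with_syst:.3f}")
```

With few events the two numbers nearly coincide, whereas for very large samples the systematic term dominates the denominator and the sensitivity stops improving in proportion to the statistics.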



4 Conclusions 



We have presented a simple $\chi_\infty^2$ estimator to evaluate the potential of a
reaction for studying parameters. The estimator reveals the highest accuracy this
reaction could provide under ideal conditions for determining the numerical
values of these parameters: the Cramer-Rao bound.

This estimator does not make any claim about the precision to be obtained
under normal running conditions, except that it can never be better. In
practice, however, this limit can be closely approached by a maximum likelihood
data analysis, if the systematic errors are not too large.

Since the $\chi_\infty^2$ estimator provides a bound on what precision can be achieved
by a particular reaction in the best of all cases, it is a safe measure to compute
this number before embarking on a more time consuming detailed analysis.
It can then be decided whether or not this reaction has any chance to
compete in precision with others.
Fig. 1. Dependence of the resolving power on the actual value of the electric dipole
moment of the electron $\bar d$. The resolution with one bin (total cross section) and an
infinite number of bins (the Cramer-Rao limit) are given by the thinner and thicker
curves respectively.
Fig. 4. Dependence of $\chi_N^2$ on the number of bins $N$. The effect of systematic
errors is shown by the dotted curve.